Learning Partially Observable Action Models: Efficient Algorithms
نویسندگان
چکیده
We present tractable, exact algorithms for learning actions’ effects and preconditions in partially observable domains. Our algorithms maintain a propositional logical representation of the set of possible action models after each observation and action execution. The algorithms perform exact learning of preconditions and effects in any deterministic action domain. This includes STRIPS actions and actions with conditional effects. In contrast, previous algorithms rely on approximations to achieve tractability, and do not supply approximation guarantees. Our algorithms take time and space that are polynomial in the number of domain features, and can maintain a representation that stays compact indefinitely. Our experimental results show that we can learn efficiently and practically in domains that contain over 1000’s of features (more than 2 states).
منابع مشابه
Learning Partially Observable Action Models
In this paper we present tractable algorithms for learning a logical model of actions’ effects and preconditions in deterministic partially observable domains. These algorithms update a representation of the set of possible action models after every observation and action execution. We show that when actions are known to have no conditional effects, then the set of possible action models can be...
متن کاملSpectral Approaches to Learning Predictive Representations
A central problem in artificial intelligence is to choose actions to maximize reward in a partially observable, uncertain environment. To do so, we must obtain an accurate environment model, and then plan to maximize reward. However, for complex domains, specifying a model by hand can be a time consuming process. This motivates an alternative approach: learning a model directly from observation...
متن کاملReinforcement Learning for Factored
Reinforcement Learning for Factored Markov Decision Processes Brian Sallans Doctor of Philosophy Graduate Department of Computer Science University of Toronto 2002 Learning to act optimally in a complex, dynamic and noisy environment is a hard problem. Various threads of research from reinforcement learning, animal conditioning, operations research, machine learning, statistics and optimal cont...
متن کاملLearning discrete Bayesian models for autonomous agent navigation
Partially observable Markov decision processes (POMDPs) are a convenient representation for reasoning and planning in mobile robot applications. We investigate two algorithms for learning POMDPs from series of observation/action pairs by comparing their performance in fourteen synthetic worlds in conjunction with four planning algorithms. Experimental results suggest that the traditional Baum-W...
متن کاملLearning Partially Observable Deterministic Action Models
We present the first tractable, exact solution for the problem of identifying actions’ effects in partially observable STRIPS domains. Our algorithms resemble Version Spaces and Logical Filtering, and they identify all the models that are consistent with observations. They apply in other deterministic domains (e.g., with conditional effects), but are inexact (may return false positives) or inef...
متن کامل